Although we do need to stick to either, Base R or Tidyverse, let us start by revising some concepts from the tidyverse universe. Here is a cheatsheet - https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
Recall the data transformation workshop. We will once again need dplyr.
Open the cheatsheet. Let’s go through when we can use gather, spread, separate and unite from the tidyr package.
ggplot, BUT we also learnt about plotly last time, so let’s try this again. It works best with html files such as the one we are trying to create here.
We will create this document, everything from this point on, in R-Markdown
Today we take data from a 2017 survey about people’s preferences for halloween candy.
This is so much candy data, seriously! You can find it here - http://www.scq.ubc.ca/so-much-candy-data-seriously/
Let’s break these steps down into a combination of R code, its output, and some text explaining what is happening
First, open a new markdown file. Load the packages we will be needing today - knitr, tidyverse, ggplot and plotly.
The example below is one way of using data wrangling and data transformation tools to first create a tidy dataset and then explore what people’s cady preferences are.
##### Adding R Code - use {r eval=FALSE} if you do not want the code to run
candyhierarchy2017 <- read.csv("E:/R-Ladies/R-Ladies/candyhierarchy2017.csv", row.names=NULL)
candyhierarchy2017_2<-select(candyhierarchy2017, 7:109)
candyhierarchy2017_Tidy<-gather(candyhierarchy2017_2, "Option", "Answer")
candyhierarchy2017_Tidy$Joy<-ifelse(candyhierarchy2017_Tidy$Answer=="JOY", 1, 0)
candyhierarchy2017_Tidy$Despair<-ifelse(candyhierarchy2017_Tidy$Answer=="DESPAIR", 1, 0)
candyhierarchy2017_Joy<-aggregate(Joy ~ Option, candyhierarchy2017_Tidy, sum)
candyhierarchy2017_Despair<-aggregate(Despair ~ Option, candyhierarchy2017_Tidy, sum)
candyhierarchy2017_Tidy2<-merge(candyhierarchy2017_Joy, candyhierarchy2017_Despair, by="Option")
Markdown let’s you embed plots:
##Note that the `echo = FALSE` parameter prevents printing of the R code that generated the plot.
CandyPlot<-ggplot(candyhierarchy2017_Tidy2, aes(x=Joy, y=Despair)) +
geom_point(aes(col=Option)) +
labs(subtitle="Halloween Candy 2017",
y="Despair",
x="Joy",
title="Scatterplot",
caption = "From Candy Heirarchy Data")
ggplotly(CandyPlot)